    Language-independent evaluation of translation style and consistency: comparing human and machine translations of Camus’ novel “The Stranger”

    We present quantitative and qualitative results of automatic and manual comparisons of translations of Camus’ originally French novel “The Stranger” (French: L’Étranger). We provide a novel approach to evaluating translation performance across languages without the need for reference translations or comparable corpora. Our approach examines the consistency of the translation at various document levels, including chapters, parts and sentences. In our experiments we analyse four expert translations of the French novel, using Google’s machine translation output as a baseline. We analyse the translations using readability metrics, rank correlation comparisons and Word Error Rate (WER).
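
    At the sentence level, this kind of pairwise comparison reduces to an edit-distance computation. Below is a minimal sketch of Word Error Rate between two translations of the same sentence, assuming simple whitespace tokenisation; the function name and the toy example are illustrative, not taken from the paper.

```python
# Minimal WER sketch: word-level Levenshtein distance, normalised by the
# length of the reference translation. Illustrative only.

def word_error_rate(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edits needed to turn the first i reference words
    # into the first j hypothesis words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(substitution, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

# Two translators' renderings of the same source sentence:
print(word_error_rate("mother died today", "maman died today"))  # ~0.33
```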

    Semi-supervised prediction of protein interaction sentences exploiting semantically encoded metrics

    Protein-protein interaction (PPI) identification is an integral component of many biomedical research and database curation tools. Automation of this task through classification is one of the key goals of text mining (TM). However, the labelled PPI corpora required to train classifiers are generally small. In order to overcome this sparsity in the training data, we propose a novel method of integrating corpora that do not contain relevance judgements. Our approach uses a semantic language model to gather word similarity from a large unlabelled corpus. This additional information is integrated into the sentence classification process using kernel transformations and has a re-weighting effect on the training features that leads to an 8% improvement in F-score over the baseline results. Furthermore, we discover that some words which are generally considered indicative of interactions are actually neutralised by this process.
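
    As a rough illustration of the kernel-transformation idea, here is a minimal sketch of a semantically smoothed linear kernel, assuming a word-word similarity matrix S estimated from a large unlabelled corpus. The specific transformation and all names are assumptions for illustration, not the paper's exact method.

```python
# Semantic smoothing kernel sketch: bag-of-words vectors are mapped through a
# word similarity matrix S before the inner product, so sentences that use
# related but distinct words (e.g. "binds" vs. "interacts") still match.
import numpy as np

def semantic_kernel(X: np.ndarray, Y: np.ndarray, S: np.ndarray) -> np.ndarray:
    # K(x, y) = (S x) . (S y), a valid kernel since S^T S is positive semidefinite
    return (X @ S.T) @ (S @ Y.T)

# Toy vocabulary: ["binds", "interacts", "cell"]; the first two are similar.
S = np.array([[1.0, 0.8, 0.0],
              [0.8, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
x = np.array([[1.0, 0.0, 1.0]])  # sentence using "binds"
y = np.array([[0.0, 1.0, 1.0]])  # sentence using "interacts"
print(semantic_kernel(x, y, S))  # 2.6, vs. 1.0 for the plain dot product
```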

    Stance Detection in Web and Social Media: A Comparative Study

    Online forums and social media platforms are increasingly used to discuss topics of varying polarity, where different people take different stances. Several methodologies for automatic stance detection from text have been proposed in the literature, but to our knowledge there has been no systematic investigation of their reproducibility or their comparative performance. In this work, we explore the reproducibility of several existing stance detection models, including both neural models and classical classifier-based models. Through experiments on two datasets, (i) the popular SemEval microblog dataset and (ii) a set of health-related online news articles, we also perform a detailed comparative analysis of the various methods and explore their shortcomings. Implementations of all algorithms discussed in this paper are available at https://github.com/prajwal1210/Stance-Detection-in-Web-and-Social-Media
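
    As a concrete starting point, here is a minimal sketch of one classical classifier-based approach of the kind compared in such studies: TF-IDF features with logistic regression. The toy texts and labels are invented; this is not the exact configuration of any system in the paper.

```python
# Sketch of a classical stance classifier: TF-IDF n-grams + logistic regression.
# SemEval-style labels: FAVOR / AGAINST / NONE. Toy data, illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "Climate change is a hoax invented for profit",
    "We must act now to cut carbon emissions",
    "Not sure what to think about the climate debate",
]
stances = ["AGAINST", "FAVOR", "NONE"]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(texts, stances)
print(model.predict(["carbon emissions must be reduced immediately"]))
```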

    Modeling the Structure and Dynamics of Semantic Processing

    The contents and structure of semantic memory have been the focus of much recent research, with major advances in the development of distributional models, which use word co-occurrence information as a window into the semantics of language. In parallel, connectionist modeling has extended our knowledge of the processes engaged in semantic activation. However, these two lines of investigation have rarely been brought together. Here, we describe a processing model based on distributional semantics in which activation spreads throughout a semantic network, as dictated by the patterns of semantic similarity between words. We show that the activation profile of the network, measured at various time points, can successfully account for response times in lexical and semantic decision tasks, as well as for subjective concreteness and imageability ratings. We also show that the dynamics of the network are predictive of performance in relational semantic tasks, such as similarity/relatedness rating. Our results indicate that bringing together distributional semantic networks and spreading activation provides a good fit to both automatic lexical processing (as indexed by lexical and semantic decisions) and more deliberate processing (as indexed by ratings), above and beyond what has been reported for previous models that take into account only the similarity arising from network structure.
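
    A minimal sketch of the spreading-activation idea follows, assuming a similarity-weighted word graph and a simple retention/diffusion update; the update rule, parameters, and toy network are illustrative assumptions, not the paper's exact model.

```python
# Spreading activation over a toy semantic network: a cue word is activated,
# and activation diffuses along similarity-weighted edges over discrete steps.
import numpy as np

def spread_activation(W, cue, steps, retention=0.5):
    """W: row-normalised similarity matrix; returns activation after each step."""
    act = np.zeros(W.shape[0])
    act[cue] = 1.0
    history = [act.copy()]
    for _ in range(steps):
        act = retention * act + (1 - retention) * (W.T @ act)  # diffuse
        history.append(act.copy())
    return history

# Toy 3-word network: 0 = "doctor", 1 = "nurse", 2 = "bread"
sim = np.array([[0.0, 0.9, 0.1],
                [0.9, 0.0, 0.1],
                [0.1, 0.1, 0.0]])
W = sim / sim.sum(axis=1, keepdims=True)  # row-normalise outgoing weights
for t, act in enumerate(spread_activation(W, cue=0, steps=3)):
    print(t, np.round(act, 3))  # "nurse" activates early; "bread" barely does
```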

    Frame-semantic parsing

    Frame semantics is a linguistic theory that has been instantiated for English in the FrameNet lexicon. We solve the problem of frame-semantic parsing using a two-stage statistical model that takes lexical targets (i.e., content words and phrases) in their sentential contexts and predicts frame-semantic structures. Given a target in context, the first stage disambiguates it to a semantic frame. This model uses latent variables and semi-supervised learning to improve frame disambiguation for targets unseen at training time. The second stage finds the target's locally expressed semantic arguments. At inference time, a fast exact dual decomposition algorithm collectively predicts all the arguments of a frame at once in order to respect declaratively stated linguistic constraints, resulting in qualitatively better structures than naïve local predictors. Both components are feature-based and discriminatively trained on a small set of annotated frame-semantic parses. On the SemEval 2007 benchmark data set, the approach, along with a heuristic identifier of frame-evoking targets, outperforms the prior state of the art by significant margins. Additionally, we present experiments on the much larger FrameNet 1.5 data set. We have released our frame-semantic parser as open-source software. (Funding: DARPA grant NBCH-1080004; NSF grants IIS-0836431 and IIS-0915187; Qatar National Research Fund grant NPRP 08-485-1-083.)
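
    To make the two-stage shape concrete, here is a minimal sketch with toy scoring functions standing in for the discriminatively trained models; the real argument stage additionally uses exact dual decomposition to enforce linguistic constraints (e.g., non-overlapping arguments), which this sketch omits. All names and data here are invented.

```python
# Two-stage frame-semantic parsing sketch: stage 1 disambiguates the target to
# a frame, stage 2 picks the best-scoring span for each role of that frame.

frames = {"Commerce_buy": ["Buyer", "Goods"], "Grasp": ["Agent", "Theme"]}

def frame_score(target, frame):
    # Toy lexical score; the real model uses rich features and latent variables.
    return 1.0 if (target == "purchased" and frame == "Commerce_buy") else 0.1

def arg_score(frame, role, span, sentence):
    # Toy span score; the real model is feature-based and constraint-aware.
    gold = {"Buyer": "Mary", "Goods": "a car", "Agent": "Mary", "Theme": "a car"}
    return 1.0 if " ".join(sentence[span[0]:span[1]]) == gold[role] else 0.0

def parse(target, sentence):
    frame = max(frames, key=lambda f: frame_score(target, f))           # stage 1
    spans = [(i, j) for i in range(len(sentence))
             for j in range(i + 1, len(sentence) + 1)]
    args = {role: max(spans, key=lambda s: arg_score(frame, role, s, sentence))
            for role in frames[frame]}                                  # stage 2
    return frame, args

print(parse("purchased", ["Mary", "purchased", "a", "car"]))
# ('Commerce_buy', {'Buyer': (0, 1), 'Goods': (2, 4)})
```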

    Logical Metonymy Resolution in a Words-as-Cues Framework: Evidence From Self-Paced Reading and Probe Recognition

    Logical metonymy resolution (begin a book -> begin reading a book or begin writing a book) has traditionally been explained either through complex lexical entries (qualia structures) or through the integration of the implicit event via post-lexical access to world knowledge. We propose that recent work within the words-as-cues paradigm can provide a more dynamic model of logical metonymy, accounting for early and dynamic integration of complex event information depending on previous contextual cues (agent and patient). We first present a self-paced reading experiment on German subordinate sentences, where metonymic sentences and their paraphrased versions differ only in the presence or absence of the clause-final target verb (Der Konditor begann die Glasur -> Der Konditor begann, die Glasur aufzutragen / The baker began the icing -> The baker began spreading the icing). Longer reading times at the target verb position in a high-typicality condition (baker + icing -> spread) compared to a low-typicality (but still plausible) condition (child + icing -> spread) suggest that we make use of knowledge activated by lexical cues to build expectations about events. The early and dynamic integration of event knowledge in metonymy interpretation is bolstered by further evidence from a second experiment using the probe recognition paradigm. Presenting covert events as probes following a high-typicality or a low-typicality metonymic sentence (Der Konditor begann die Glasur -> AUFTRAGEN / The baker began the icing -> SPREAD), we obtain an analogous effect of typicality at a 100 ms interstimulus interval.
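
    For illustration only, here is a minimal sketch of the kind of by-condition comparison such a design implies, with invented reading times at the target verb; analyses of real self-paced reading data typically use mixed-effects models over items and participants rather than a plain t-test.

```python
# Toy comparison of reading times at the target verb by typicality condition.
# All numbers are invented for illustration.
from statistics import mean
from scipy import stats

rt_high_typicality = [412, 389, 430, 455, 401, 420]  # ms, baker + icing
rt_low_typicality = [378, 365, 392, 405, 371, 388]   # ms, child + icing

t, p = stats.ttest_ind(rt_high_typicality, rt_low_typicality)
print(f"high = {mean(rt_high_typicality):.0f} ms, "
      f"low = {mean(rt_low_typicality):.0f} ms, t = {t:.2f}, p = {p:.3f}")
```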

    Generative and Discriminative Learning in Semantic Role Labeling for Italian


    Design and Compilation of Syntactically Tagged Corpus of Japanese Statutory Sentences


    Comparing Different Properties Involved in Word Similarity Extraction

    In this paper, we analyze the behavior of several parameters, namely the type of context, the similarity measure, and the word space model, in the task of word similarity extraction from large corpora. The main objective is to describe experiments comparing different extraction systems based on all possible combinations of these parameters. Special attention is paid to the comparison between syntax-based contexts and windowing techniques, binary similarity metrics and more elaborate coefficients, and baseline word space models versus Singular Value Decomposition strategies. The evaluation leads us to conclude that the combination of syntax-based contexts, binary similarity metrics, and a baseline word space model makes the extraction much more precise than combinations with more elaborate metrics and complex models.
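
    As a concrete illustration of the winning combination, here is a minimal sketch of a binary similarity metric (the Dice coefficient) over syntax-based context sets; the dependency contexts shown are invented for the example.

```python
# Dice coefficient over sets of syntactic contexts: each word is represented
# by the set of (relation, head) pairs it occurs with, and only the presence
# or absence of a context matters (a "binary" metric). Toy data, illustrative.

def dice(contexts_a: set, contexts_b: set) -> float:
    if not contexts_a and not contexts_b:
        return 0.0
    return 2 * len(contexts_a & contexts_b) / (len(contexts_a) + len(contexts_b))

wine = {("obj_of", "drink"), ("mod_by", "red"), ("obj_of", "pour")}
beer = {("obj_of", "drink"), ("mod_by", "cold"), ("obj_of", "pour")}
table = {("obj_of", "set"), ("mod_by", "wooden")}

print(dice(wine, beer))   # 0.67: two shared verbal contexts
print(dice(wine, table))  # 0.0: no shared contexts
```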